Add TinyOpenFold: GPU Optimization Tutorial with AlphaFold 2 Evoformer #164
Open
asitav wants to merge 41 commits into
… exercises/ README file.
…hon for Python call stack profiling. Updated default parameters for batch size and sequence length to optimize output size. Enhanced README with detailed usage instructions and output file descriptions.
- Replace manual wheel downloads with pip install from nightly repository
- Update requirements.txt with new PyTorch versions
- Simplify installation process
- Update run_rocprof_sys.sh file.
…ling
- Add automatic detection of rocpd package availability
- Conditionally enable ROCPROFSYS_USE_ROCPD only if rocpd is found
- Set ROCPROFSYS_CONFIG_FILE only if ~/.rocprof-sys.cfg exists
- Add --trace flag to rocprof-sys-python command
- Update help text with accurate configuration information
- Updated venv names from venvOF/venvOFr711 to simple venv
- Changed ROCm module from 7.1.1 to 7.2 (PyTorch still uses ROCm 7.1 nightly)
- V2 README now references main README for environment setup to avoid duplication
- Updated requirements.txt with current dependencies
- Add pip install command for requirements_rocprof-compute-develop.txt in README.md
- Include new requirements file for rocprof-compute development dependencies
- Source: https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-compute/requirements.txt
Includes comprehensive tutorial docs and automated test script for demonstrating progressive optimization from baseline PyTorch to custom Triton kernels.
- Optimized FLOPS_ANALYSIS.md for conciseness
- Removed redundant files and scaling scripts
- Removed exercises directories from v2 and v3
- Updated documentation references
Summary
This PR adds TinyOpenFold, an educational example demonstrating GPU optimization techniques on AMD GPUs. The tutorial progressively optimizes an AlphaFold 2 Evoformer implementation, from baseline PyTorch to custom Triton kernels.
Key Features
Three Progressive Optimization Stages:
Comprehensive Profiling Integration:
Complete Educational Pipeline:
What's Included
Problem Sizes
The tutorial demonstrates optimization across different problem sizes:
Educational Value
This example teaches:
Testing
--validate-setup)
Documentation
PERFORMANCE_OPTIMIZATION_TUTORIAL.md) with step-by-step guide
ARCHITECTURE.md)
optimization_tutorial.sh)
Target Audience
Related
This example complements existing HPC Training Examples by providing:
Ready to merge: All code tested, documentation complete, and profiling integration verified.